-
On the causes, consequences, and avoidance of PCR duplicates: Towards a theory of library complexity

Abstract: Library preparation protocols for most sequencing technologies involve PCR amplification of the template DNA, which opens the possibility that a given template DNA molecule is sequenced multiple times. Reads arising from this phenomenon, known as PCR duplicates, inflate the cost of sequencing and can jeopardize the reliability of affected experiments. Despite the pervasiveness of this artefact, our understanding of its causes and of its impact on downstream statistical analyses remains essentially empirical. Here, we develop a general quantitative model of amplification distortions in sequencing data sets, which we leverage to investigate the factors controlling the occurrence of PCR duplicates. We show that the PCR duplicate rate is determined primarily by the ratio between library complexity and sequencing depth, and that amplification noise (including its dependence on the number of PCR cycles) plays only a secondary role for this artefact. We confirm our predictions using new and published RAD‐seq libraries and provide a method to estimate library complexity and amplification noise in any data set containing PCR duplicates. We discuss how amplification‐related artefacts impact downstream analyses, in particular genotyping accuracy. The proposed framework unites the numerous observations made on PCR duplicates and will be useful to experimenters of all sequencing technologies where DNA availability is a concern.
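The claim that the duplicate rate depends mainly on the ratio between library complexity and sequencing depth can be illustrated with a minimal sketch. This is not the authors' model: it assumes no amplification noise at all, so that every read is drawn uniformly and with replacement from a pool of distinct template molecules. The function name `expected_duplicate_rate` and its parameters are hypothetical, chosen only for this illustration.

```python
def expected_duplicate_rate(complexity: int, depth: int) -> float:
    """Expected fraction of reads that are duplicates.

    Assumption (not from the paper): `depth` reads are sampled uniformly
    with replacement from `complexity` distinct template molecules, i.e.
    amplification is perfectly even.
    """
    if complexity <= 0 or depth <= 0:
        raise ValueError("complexity and depth must be positive")
    # Probability that a given molecule is never sampled in `depth` draws.
    p_unseen = (1.0 - 1.0 / complexity) ** depth
    # Expected number of distinct molecules observed at least once.
    expected_unique = complexity * (1.0 - p_unseen)
    # Every read beyond the first copy of a molecule counts as a duplicate.
    return 1.0 - expected_unique / depth


if __name__ == "__main__":
    # Same depth-to-complexity ratio at two different scales.
    for complexity, depth in [(1_000, 1_000), (1_000_000, 1_000_000)]:
        rate = expected_duplicate_rate(complexity, depth)
        print(f"complexity={complexity}, depth={depth}, rate={rate:.4f}")
```

Under this simplified assumption, libraries with the same depth-to-complexity ratio give nearly the same expected duplicate rate regardless of their absolute size, which is consistent with the ratio being the dominant factor. Uneven amplification, which the abstract describes as a secondary effect, is deliberately left out.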
-
OBJECTIVE: To characterize high type 1 diabetes (T1D) genetic risk in a population where type 2 diabetes (T2D) predominates.

RESEARCH DESIGN AND METHODS: Characteristics typically associated with T1D were assessed in 109,594 Million Veteran Program participants with adult-onset diabetes, 2011–2021, who had T1D genetic risk scores (GRS) defined as low (0 to <45%), medium (45 to <90%), high (90 to <95%), or highest (≥95%).

RESULTS: T1D characteristics increased progressively with higher genetic risk (P < 0.001 for trend). A GRS ≥ 90% was more common with diabetes diagnoses before age 40 years, but 95% of those participants were diagnosed at age ≥40 years, and they resembled T2D in mean age (64.3 years) and BMI (32.3 kg/m²). Compared with the low-risk group, the highest-risk group was more likely to have diabetic ketoacidosis (low 0.9% vs. highest GRS 3.7%), hypoglycemia prompting emergency visits (3.7% vs. 5.8%), outpatient plasma glucose <50 mg/dL (7.5% vs. 13.4%), a shorter median time to start insulin (3.5 vs. 1.4 years), use of a T1D diagnostic code (16.3% vs. 28.1%), low C-peptide levels if tested (1.8% vs. 32.4%), and glutamic acid decarboxylase antibodies (6.9% vs. 45.2%), all P < 0.001.

CONCLUSIONS: Characteristics associated with T1D were increased with higher genetic risk, and especially with the top 10% of risk. However, the age and BMI of those participants resemble people with T2D, and a substantial proportion did not have diagnostic testing or use of T1D diagnostic codes. T1D genetic screening could be used to aid identification of adult-onset T1D in settings in which T2D predominates.
